AITopics | intent prediction

Collaborating Authors

intent prediction

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

MINT-RVAE: Multi-Cues Intention Prediction of Human-Robot Interaction using Human Pose and Emotion Information from RGB-only Camera Data

Mohsen, Farida, Safa, Ali

arXiv.org Artificial IntelligenceSep-29-2025

Efficiently detecting human intent to interact with ubiquitous robots is crucial for effective human-robot interaction (HRI) and collaboration. Over the past decade, deep learning has gained traction in this field, with most existing approaches relying on multimodal inputs, such as RGB combined with depth (RGB-D), to classify time-sequence windows of sensory data as interactive or non-interactive. In contrast, we propose a novel RGB-only pipeline for predicting human interaction intent with frame-level precision, enabling faster robot responses and improved service quality. A key challenge in intent prediction is the class imbalance inherent in real-world HRI datasets, which can hinder the model's training and generalization. To address this, we introduce MINT-RVAE, a synthetic sequence generation method, along with new loss functions and training strategies that enhance generalization on out-of-sample data. Our approach achieves state-of-the-art performance (AUROC: 0.95) outperforming prior works (AUROC: 0.90-0.912), while requiring only RGB input and supporting precise frame onset prediction. Finally, to support future research, we openly release our new dataset with frame-level labeling of human interaction intent.

artificial intelligence, machine learning, sequence, (20 more...)

arXiv.org Artificial Intelligence

2509.22573

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving

Godbole, Mihir, Gao, Xiangbo, Tu, Zhengzhong

arXiv.org Artificial IntelligenceAug-12-2025

Understanding the short-term motion of vulnerable road users (VRUs) like pedestrians and cyclists is critical for safe autonomous driving, especially in urban scenarios with ambiguous or high-risk behaviors. While vision-language models (VLMs) have enabled open-vocabulary perception, their utility for fine-grained intent reasoning remains underexplored. Notably, no existing benchmark evaluates multi-class intent prediction in safety-critical situations, To address this gap, we introduce DRAMA-X, a fine-grained benchmark constructed from the DRAMA dataset via an automated annotation pipeline. DRAMA-X contains 5,686 accident-prone frames labeled with object bounding boxes, a nine-class directional intent taxonomy, binary risk scores, expert-generated action suggestions for the ego vehicle, and descriptive motion summaries. These annotations enable a structured evaluation of four interrelated tasks central to autonomous decision-making: object detection, intent prediction, risk assessment, and action suggestion. As a reference baseline, we propose SGG-Intent, a lightweight, training-free framework that mirrors the ego vehicle's reasoning pipeline. It sequentially generates a scene graph from visual input using VLM-backed detectors, infers intent, assesses risk, and recommends an action using a compositional reasoning stage powered by a large language model. We evaluate a range of recent VLMs, comparing performance across all four DRAMA-X tasks. Our experiments demonstrate that scene-graph-based reasoning enhances intent prediction and risk assessment, especially when contextual cues are explicitly modeled.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.1759

Country: Asia > Japan (0.28)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (0.79)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

GAIPAT -Dataset on Human Gaze and Actions for Intent Prediction in Assembly Tasks

Grand, Maxence, Pellier, Damien, Jambon, Francis

arXiv.org Artificial IntelligenceMar-14-2025

The primary objective of the dataset is to provide a better understanding of the coupling between human actions and gaze in a shared working environment with a cobot, with the aim of signifcantly enhancing the effciency and safety of humancobot interactions. More broadly, by linking gaze patterns with physical actions, the dataset offers valuable insights into cognitive processes and attention dynamics in the context of assembly tasks. The proposed dataset contains gaze and action data from approximately 80 participants, recorded during simulated industrial assembly tasks. The tasks were simulated using controlled scenarios in which participants manipulated educational building blocks. Gaze data was collected using two different eye-tracking setups -head-mounted and remote-while participants worked in two positions: sitting and standing.

assembly task, instruction, participant, (11 more...)

arXiv.org Artificial Intelligence

2503.11186

Country: Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.08)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.90)
Information Technology > Artificial Intelligence > Robots (0.72)
Information Technology > Artificial Intelligence > Vision (0.70)

Add feedback

ZIA: A Theoretical Framework for Zero-Input AI

De, Aditi, Labs, NeuroBits

arXiv.org Artificial IntelligenceFeb-22-2025

Zero-Input AI (ZIA) introduces a novel framework for human-computer interaction by enabling proactive intent prediction without explicit user commands. It integrates gaze tracking, bio-signals (EEG, heart rate), and contextual data (time, location, usage history) into a multi-modal model for real-time inference, targeting <100 ms latency. The proposed architecture employs a transformer-based model with cross-modal attention, variational Bayesian inference for uncertainty estimation, and reinforcement learning for adaptive optimization. To support deployment on edge devices (CPUs, TPUs, NPUs), ZIA utilizes quantization, weight pruning, and linear attention to reduce complexity from quadratic to linear with sequence length. Theoretical analysis establishes an information-theoretic bound on prediction error and demonstrates how multi-modal fusion improves accuracy over single-modal approaches. Expected performance suggests 85-90% accuracy with EEG integration and 60-100 ms inference latency. ZIA provides a scalable, privacy-preserving framework for accessibility, healthcare, and consumer applications, advancing AI toward anticipatory intelligence.

accuracy, theoretical framework, zia, (14 more...)

arXiv.org Artificial Intelligence

2502.16124

Country: Asia > India > Uttarakhand > Roorkee (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Architecture > Real Time Systems (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

GNN-based Decentralized Perception in Multirobot Systems for Predicting Worker Actions

Imran, Ali, Beltrame, Giovanni, St-Onge, David

arXiv.org Artificial IntelligenceJan-7-2025

-- In industrial environments, predicting human actions is essential for ensuring safe and effective collaboration between humans and robots. This paper introduces a perception framework that enables mobile robots to understand and share information about human actions in a decentralized way. The framework first allows each robot to build a spatial graph representing its surroundings, which it then shares with other robots. This shared spatial data is combined with temporal information to track human behavior over time. A swarm-inspired decision-making process is used to ensure all robots agree on a unified interpretation of the human's actions. Results show that adding more robots and incorporating longer time sequences improve prediction accuracy. Collaborative robots are poised to become a cornerstone of Industry 5.0 [1], emphasizing human-centric design solutions to meet the flexibility demands of hyper-customized industrial processes [2]. Significant efforts have been directed toward identifying key enabling technologies to enhance robotic systems with advanced situational awareness and robust safety features for human coworkers. Two pivotal technologies stand out: individualized human-machine interaction systems that merge the strengths of humans and machines, and the application of AI to improve workplace safety [3].

information, prediction, robot, (16 more...)

arXiv.org Artificial Intelligence

2501.04193

Country:

North America > Canada (0.04)
Europe (0.04)

Genre: Research Report (0.70)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.66)

Add feedback

Auto-Intent: Automated Intent Discovery and Self-Exploration for Large Language Model Web Agents

Kim, Jaekyeom, Kim, Dong-Ki, Logeswaran, Lajanugen, Sohn, Sungryull, Lee, Honglak

arXiv.org Artificial IntelligenceOct-29-2024

In this paper, we introduce Auto-Intent, a method to adapt a pre-trained large language model (LLM) as an agent for a target domain without direct fine-tuning, where we empirically focus on web navigation tasks. Our approach first discovers the underlying intents from target domain demonstrations unsupervisedly, in a highly compact form (up to three words). With the extracted intents, we train our intent predictor to predict the next intent given the agent's past observations and actions. In particular, we propose a self-exploration approach where top-k probable intent predictions are provided as a hint to the pre-trained LLM agent, which leads to enhanced decision-making capabilities. Auto-Intent substantially improves the performance of GPT-{3.5, 4} and Llama-3.1-{70B, 405B} agents on the large-scale real-website navigation benchmarks from Mind2Web and online navigation tasks from WebArena with its cross-benchmark generalization from Mind2Web.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.22552

Country:

North America > United States > Virginia > Richmond (0.04)
North America > United States > New York (0.04)
North America > United States > Michigan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Add feedback

Bayesian Intention for Enhanced Human Robot Collaboration

Hernandez-Cruz, Vanessa, Zhang, Xiaotong, Youcef-Toumi, Kamal

arXiv.org Artificial IntelligenceSep-30-2024

Predicting human intent is challenging yet essential to achieving seamless Human-Robot Collaboration (HRC). Many existing approaches fail to fully exploit the inherent relationships between objects, tasks, and the human model. Current methods for predicting human intent, such as Gaussian Mixture Models (GMMs) and Conditional Random Fields (CRFs), often lack interpretability due to their failure to account for causal relationships between variables. To address these challenges, in this paper, we developed a novel Bayesian Intention (BI) framework to predict human intent within a multi-modality information framework in HRC scenarios. This framework captures the complexity of intent prediction by modeling the correlations between human behavior conventions and scene data. Our framework leverages these inferred intent predictions to optimize the robot's response in real-time, enabling smoother and more intuitive collaboration. We demonstrate the effectiveness of our approach through a HRC task involving a UR5 robot, highlighting BI's capability for real-time human intent prediction and collision avoidance using a unique dataset we created. Our evaluations show that the multi-modality BI model predicts human intent within 2.69ms, with a 36% increase in precision, a 60% increase in F1 Score, and an 85% increase in accuracy compared to its best baseline method. The results underscore BI's potential to advance real-time human intent prediction and collision avoidance, making a significant contribution to the field of HRC.

intent prediction, orientation, prediction, (14 more...)

arXiv.org Artificial Intelligence

2410.00302

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.62)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

A Population-to-individual Tuning Framework for Adapting Pretrained LM to On-device User Intent Prediction

Gong, Jiahui, Ding, Jingtao, Meng, Fanjin, Chen, Guilong, Chen, Hong, Zhao, Shen, Lu, Haisheng, Li, Yong

arXiv.org Artificial IntelligenceAug-19-2024

Mobile devices, especially smartphones, can support rich functions and have developed into indispensable tools in daily life. With the rise of generative AI services, smartphones can potentially transform into personalized assistants, anticipating user needs and scheduling services accordingly. Predicting user intents on smartphones, and reflecting anticipated activities based on past interactions and context, remains a pivotal step towards this vision. Existing research predominantly focuses on specific domains, neglecting the challenge of modeling diverse event sequences across dynamic contexts. Leveraging pre-trained language models (PLMs) offers a promising avenue, yet adapting PLMs to on-device user intent prediction presents significant challenges. To address these challenges, we propose PITuning, a Population-to-Individual Tuning framework. PITuning enhances common pattern extraction through dynamic event-to-intent transition modeling and addresses long-tailed preferences via adaptive unlearning strategies. Experimental results on real-world datasets demonstrate PITuning's superior intent prediction performance, highlighting its ability to capture long-tailed preferences and its practicality for on-device prediction scenarios.

dataset, population-to-individual tuning framework, sequence, (14 more...)

arXiv.org Artificial Intelligence

2408.09815

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
North America > United States > New York > New York County > New York City (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
(9 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

IntentRec: Predicting User Session Intent with Hierarchical Multi-Task Learning

Oh, Sejoon, Bhattacharya, Moumita, Feng, Yesu, Lamkhede, Sudarshan

arXiv.org Artificial IntelligenceJul-25-2024

Recommender systems have played a critical role in diverse digital services such as e-commerce, streaming media, social networks, etc. If we know what a user's intent is in a given session (e.g. do they want to watch short videos or a movie or play games; are they shopping for a camping trip), it becomes easier to provide high-quality recommendations. In this paper, we introduce IntentRec, a novel recommendation framework based on hierarchical multi-task neural network architecture that tries to estimate a user's latent intent using their short- and long-term implicit signals as proxies and uses the intent prediction to predict the next item user is likely to engage with. By directly leveraging the intent prediction, we can offer accurate and personalized recommendations to users. Our comprehensive experiments on Netflix user engagement data show that IntentRec outperforms the state-of-the-art next-item and next-intent predictors. We also share several findings and downstream applications of IntentRec.

intentrec, prediction, recommendation, (16 more...)

arXiv.org Artificial Intelligence

2408.05353

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > California > Santa Clara County > Los Gatos (0.04)

Genre: Research Report (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Information Technology > Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)

Add feedback

Towards Reliable and Empathetic Depression-Diagnosis-Oriented Chats

Lan, Kunyao, Ming, Cong, Yao, Binwei, Chen, Lu, Wu, Mengyue

arXiv.org Artificial IntelligenceApr-7-2024

Chatbots can serve as a viable tool for preliminary depression diagnosis via interactive conversations with potential patients. Nevertheless, the blend of task-oriented and chit-chat in diagnosis-related dialogues necessitates professional expertise and empathy. Such unique requirements challenge traditional dialogue frameworks geared towards single optimization goals. To address this, we propose an innovative ontology definition and generation framework tailored explicitly for depression diagnosis dialogues, combining the reliability of task-oriented conversations with the appeal of empathy-related chit-chat. We further apply the framework to D$^4$, the only existing public dialogue dataset on depression diagnosis-oriented chats. Exhaustive experimental results indicate significant improvements in task completion and emotional support generation in depression diagnosis, fostering a more comprehensive approach to task-oriented chat dialogue system development and its applications in digital mental health.

dialogue, dialogue state, symptom, (15 more...)

arXiv.org Artificial Intelligence

2404.05012

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.67)

Add feedback